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L EX IN G TON 


M A S S A C H U S E T T S 


Abstract 


Three  Input  Buffer  designs  which  provide  intermediate  storage 
between  the  Analog -to- Digital  Converters  and  the  Digital  Matched  Filter  of 
a Radar  Signal  Processor  are  presented.  All  designs  use  a basic  buffer 
module  having  a selectable  number  of  input  channels  and  data  storage  for- 
mats, and  differing  from  one  another  in  terms  of  the  number  of  input 
channels  and/or  levels  of  buffering. 

Prototype  hardware  experiments  have  been  conducted  on  the  speed- 
constraining  portions  of  an  all-ECL  buffer  module.  Results  indicate  that 
the  use  of  8:1  input  data  demultiplexing  together  with  ECL  lOK  technology 
will  yield  buffer  input  rates  up  to  142  Ms/s  per  channel,  and  output  rates 
of  45  Ms/s  per  rail  to  the  digital  convolver.  The  discussion  also  con- 
siders the  use  of  ECL  lOOK  technology  to  achieve  input  speeds  of  approxi- 
mately 190  Ms/s. 
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The  structure  of  the  signal  processor  as  it  is  presently  envisioned 
is  shown  in  Figure  2.  The  analog-to-digital  converters  will  be  pairs  of 
10-bit  60  Ms/s  devices  which  deliver  parallel  channels  of  22  bit  words  (10  bits 
plus  parity  for  both  I and  Q)  to  the  IB.* 

Three  designs  are  described  for  the  IB.  All  designs  are  based  upon 
a basic  buffer  module.  In  each  case,  the  IB  input  data  rate  is  at  least 
60  Ms/s  per  channel,  and  the  output  rate  is  30  Ms/s  per  rail. 

The  Digital  Convolver  System  provides  matched  filtering  of  the 
received  radar  signal  stored  in  the  IB.  This  subsystem  is  based  on  a radix  4, 
16384-point  pipelined  FFT,  clocked  at  30  t4tiz.  The  DCS  performs  filtering  by 
Fourier  transforming  the  data,  multiplying  in  the  frequency  domain  by  the 
selected  reference  function,  and  then  using  the  same  hardware  switched  to 
perform  the  inverse  transform.  The  output  is  thus  the  filtered  version  of 
the  input  data. 

The  Detection  Processor  follows  the  DCS  and  compares  the  filtered 
data  with  a predetermined  threshold.  Both  data  and  results  of  the  comparison 
are  then  passed  to  the  Output  Buffer  subsystem. 

Subsequent  to  filtering  and  thresholding,  a traditional  set  of  post 
processing  operations,  such  as  monopulse  calculations,  range  and  velocity 
estimation,  bulk  filtering  and  others  is  required  prior  to  further  processing 
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Input  Buffer  word  lengths  will  be  later  shown  as  24  bits,  which  permits 
the  eventual  use  of  11-bit  A/D's. 
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within  the  DPS.  The  Post  Processor  performs  these  functions,  and  is  envisioned 
as  a high-performance  programmable  machine  whose  architecture  is  tailored  for 
radar  post  processing  algorithms.  Alteration  of  algorithms  may  be  done  in 
software. 

The  three  buffer  memory  subsystems  shown  in  Figure  2 allow  computation 
to  proceed  at  differing  rates  within  the  Digital  Convolver,  Post  Processor 

I 

and  Data  Processing  System.  An  important  point  to  note  is  that  the  amount  of 
time  consumed  at  each  stage  must  not  exceed  the  radar  interpulse  period. 

While  the  IB  collects  raw  data  for  eventual  processing  within  the  convolver, 
the  convolver  may  be  operating  upon  IB  data  already  in  storage,  passing  the 
results  to  the  Output  Buffer.  Similarly,  the  Post  Processor  can  be  taking 
data  from  the  Output  Buffer  and  placing  it  in  the  Report  Buffer.  Thus,  the 
three  major  stages  of  the  processor  {the  A/D's,  the  DCS  and  the  Post  Processor) 
may  be  operating  on  individual  and  separate  data  sets.  The  IB  is  the  first 
buffer  in  this  pipeline  flow  and  must  be  capable  of  smoothing  the  entire  radar 
data  stream  so  that  it  can  be  evenly  processed  through  the  signal  processor. 

In  the  following  sections,  designs  for  the  IB  are  presented  in 
detail.  In  Section  II,  an  overview  of  the  basic  buffer  module  is  presented. 
This  module  is  used  in  the  three  system  configurations  which  are  discussed  in 
Section  III.  The  detailed  design  of  the  memory  module  is  described  in 
Section  IV,  and  hardware  prototype  efforts,  which  establish  the  anticipated 
speed  limits,  are  discussed  in  Section  V.  Section  VI  presents  a summary  and 


conclusions. 


II. 


BUFFER  MODULE  OVERVIEW 


Because  the  IB  serves  as  the  intermediary  between  the  60  Ms/s  A/D 
converters  and  the  30  Ms/s  DCS,  it  must  provide  input  and  output  data  rates 
equal  to  or  greater  than  60  Ms/s  and  30  Ms/s,  respectively.  The  size  or 
storage  capacity  of  the  IB  is  a function  of  the  radar  operating  modes  and 
transform  sizes  anticipated.  In  the  signal  processor,  the  matched  filter 
(DCS)  performs  16K,  8K  and  4K  point  convolution.  This  limits  the  time  band- 
width product  of  the  incoming  baseband  signal  to  less  than  16K  and  places  a 
bound  on  minimum  IB  size  at  16K  words.  As  a result,  the  IB  design  is 
centered  around  a basic  16K  module. 

Though  the  final  choice  on  number  of  input  channels  may  differ,  let  us 
assume  for  purposes  of  this  discussion  that  the  IB  may  have  up  to  four  channels. 
Moreover,  the  designs  presented  here  consider  double-or  triple-buffered  operation 


for  16K,  8K  and  4K  transforms.  Beyond  the  buffering  function,  the  IB  must  also 
reformat  stored  data  and  present  it  in  the  appropriate  order  for  the  convolver. 


A final  distinction  must  be  made  between  the  non-overlapped,  multiple-buffered 
modes  in  which  single  data  batches  are  processed  and  a separate  class  of  modes 
for  all-range  processing  in  which  stored  batches  of  data  are  overlapped  when 


presented  to  the  convolver.  When  operating  in  the  overlapped  configuration, 
it  should  be  possible  for  the  IB  to  accept  and  deliver  data  continuously. 

A simplified  block  diagram  of  the  basic  16K  x 24  half-duplex  buffer  module 


is  shown  in  Figure  3.  This  structure  always  provides  a 4-rail  output  to  the 


convolver.  The  design  presented  here  permits  the  selection  of  1,  2,  or  4 active 
input  channels  to  the  basic  16K  module.  In  Figure  3 and  subsequently  throughout 
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ADDRESS  CONTROL 


Simplified  block  diagram,  half  duplex  buffer  module 


this  report,  data  are  labelled  in  accordance  with  the  associated  radar  channel 
of  origin.  Unprimed  data  points  originate  on  channel  1,  singly  primed  data  is 
associated  with  channel  2,  etc. 

The  remainder  of  this  section  introduces  the  basic  16K  x 24  buffer 
module  in  terms  of  its  9 selectable  configurations.  The  manner  in  which  these 
modules  may  be  assembled  to  form  complete  multi-buffered  systems  capable  of 
providing  output  data  in  either  overlapped  or  non-overlapped  format  is  discussed 
in  Section  III.  The  ability  to  select  1,  2 or  4 active  input  channels  per 
module  makes  it  generally  useful  for  building  up  larger  buffer  memory  systems. 

A detailed  hardware  description  of  the  16K  x 24  buffer  module  will  be 
presented  later  in  this  report  (Section  IV).  However,  let  us  first  characterize 
module  operation  in  terms  of  the  number  of  active  input  channels  (1,  2 or  4) 
together  with  the  necessary  data  formatting  for  performing  4K,  8K  or  16K  trans- 
forms within  the  DCS.  Nine  canonic  module  configurations  are  described  in  Figures  4 
and  5 in  terms  of  input  and  output  data  streams.  The  use  of  individual  modules 
in  their  two  or  four  input  channel  configurations  is  of  interest  when  either  the 
utmost  input  speed  is  not  a requirement,  or  when  the  number  of  modules  per  system 
is  limited.  In  the  buffer  memory  designs  to  be  later  described,  all  modules  are 
employed  in  their  high  speed,  single  input  channel  configuration. 

The  first  three  module  configurations  apply  to  single  input  channel 
operation  and  provide  correct  output  data  formatting  for  a single  16K  transform, 
two  8K  transforms,  or  four  4K  transforms,  respectively. 

Configurations  4 through  6 of  Figure  4 are  for  two-channel  operation. 
Configuration  4 describes  the  collection  of  two  8K-point  data  sets  on  separate 
channels,  and  subsequent  memory  readout  in  "packed”  format  for  16K  transform 
processing. 
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118-2-131961 

Fig.  4.  Memory  module  input  and  output  data  streams  for 
Configurations  1 to  6. 
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Configuration  5 is  arranged  such  that  successive  8K  transforms  may  be  performed 
on  data  associated  with  two  independent  channels,  while  configuration  6 provides 
proper  data  ordering  for  four  successive  4K  transforms.  The  8K  data  samples  on 
each  input  channel  need  not  necessarily  be  viewed  as  8K  contiguous  points,  but 
could  instead  actually  be  two  concatenated  4K  sets  per  channel. 

Module  configurations  7 through  9 of  Figure  5 accommodate  four-channel 
data  inputs,  and  again  provide  the  necessary  outputs  for  calculating  16K,  8K 
or  4K  transforms.  Configuration  7 has  additional  utility  in  that  output  data 
remains  in  exactly  the  same  order  in  which  it  was  collected,  and  may,  therefore, 
be  reintroduced  into  the  buffer  input  for  either  playback  operation  or  reformatting. 
In  playback  modes  the  IB  data  is  recorded  and  played  back  in  non-real  time  for 
processing  by  the  DCS. 

It  is  approximately  true  that  the  maximum  sample  rate  at  which  data 
may  be  stored  in  an  individual  buffer  module  does  not  vary  as  the  number  of 
active  input  channels  is  altered.  That  is,  single  channel  usage  of  a given 
buffer  module  permits  data  rates  (per  channel)  which  are  twice  those  for 
2-channel  operation  and  quadruple  those  for  4-channel  operation. 

The  operation  of  several  memory  modules  within  a multi-buffered 
memory  system  implies  the  use  of  independent,  asynchronous  controllers. 

The  discussion  to  follow  presents  three  representative  IB  designs,  differing 
from  one  another  according  to  depth  of  buffering,  number  of  channels,  and 
whether  or  not  output  data  sets  may  be  overlapped  for  "all-range"  processing. 
Consideration  is  also  given  to  questions  of  parity  and  command  formats. 
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Fig.  5.  Memory  module  input  and  output  data  streams  for 
Configurations  7 to  9. 


III. 


INPUT  BUFFER  DESIGNS 


The  basic  16K  x 24  reconfigurable  module  of  Section  II  may  be 
employed  to  construct  multi-channeled,  multi- leveled  buffer  memories  in 
many  ways.  The  discussion  below  describes  three  representative  systems. 

The  first.  System  I,  is  a four-channel,  double-buffered  structure  having 
non-overlapped  data  access.  The  second.  System  2,  is  a three-channeled 
buffer  featuring  triple-buffering  and  the  ability  to  simultaneously  accom- 
modate an  auxiliary  channel.  System  3 is  a special  configuration  of 
System  2 which  illustrates  some  ways  in  which  a flexible  memory  structure 
may  be  employed  to  store  data  continuously  while  delivering  overlapped  output 
data  sets.  Finally,  consideration  will  be  given  to  questions  of  parity 
checking  and  memory  system  control  formats.  All  system  designs  employ 
single-rail  module  inputs,  hence  only  module  configurations  1,  2 and  3 of 
Section  II  will  be  employed. 

A.  Double-Buffered  System 

The  double-buffered  structure  shown  in  Figure  6 allows  full  four- 
channel  input  capability  together  with  high  input  speed.  All  eight  buffer 
modules  use  single-channel  inputs  which  will  support  data  rates  beyond 
60  Ms/s.  They  are  connected  to  four  high-performance  A/D  converter  pairs  via 
an  input  multiplexer.  Correspondingly,  an  output  multiplexer  collects  4-rail 
data  from  any  double-buffered  module  pair  and  routes  it  to  the  DCS.  Dynamic 
data  steering  is  assumed  for  all  multiplexers  along  with  asynchronous  control 
of  buffer  levels  A and  B,  i.e.,  both  half-duplex  memory  banks.  Figure  6 also 
shows  parity  generation  and  detection  blocks,  which  will  be  described  later  in 
this  Section. 
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Figures  7 and  8 show  data  streams  and  module  activity  for  three  typical 
operating  modes,  and  provide  some  insight  into  the  nature  of  the  data  flow,  the 
allocation  of  time  for  collecting  and  dispensing  data,  the  necessary  data 
steering  and  duty  cycles  for  controllers  A and  B,  which  each  drive  four  modules. 

The  time  lines  shown  are  for  illustrative  purposes,  and  should  not  be  strictly 
interpreted.  It  is  generally  true  that  for  the  actual  system  data  sets  will  not 
be  supplied  contiguously  to  the  DCS  by  the  IB. 

Mode  1 permits  the  collection  of  32K  data  points  on  two  input  channels, 
and  the  subsequent  delivery  of  eight  8K  transform  data  sets  to  the  convolver  as 
shown  in  Figure  7(a).  Channel  3 and  4 A/D's  are  inactive,  and  the  8K  data  sets 
are  interleaved  (though  they  need  not  be)  in  terms  of  their  input  channel  of 
origin.  Figure  8(a)  shows  the  time  line  approximately  to  scale,  indicating  that 
T^,  the  total  time  to  collect  64K  points  across  two  channels,  is  equal  to  T^, 
the  total  time  required  for  the  convolver  to  absorb  that  same  data  when  formatted 
as  shown  in  Figure  7(a).  The  underlying  assumptions  here  are  dedicated  A and  B 
controllers  which  require  that  all  A-level  write  activity  be  completed  before 
the  controller  roles  are  reversed  and  level  B is  written  while  level  A is  read 
asynchronously.  Of  course,  if  one  were  willing  to  dedicate  additional  controllers, 
the  data  stored  in  modules  lA  and  2A  could  be  accessed  earlier.  System  2 later 
considers  the  case  of  total  non-sharing  of  controllers. 

In  mode  2 (Figures  7(b)  and  8(b)),  the  collection  of  data  is  much  the 
same  way  as  in  mode  1,  though  each  internal  module  configuration  is  changed 
from  2 to  1 for  proper  formatting  upon  output.  Input  and  output  time  lines 
remain  unchanged,  while  buffer  module  output  rail  switching  via  the  output 


multiplexer  is  necessary. 
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Fig.  8.  System  1 memory  module  activity  for  modes  1,  2 and  3 


Mode  3 (Figures  7(c)  and  8(c))  has  the  special  property  of  delivering 
output  data  in  exactly  the  same  order  in  which  it  was  stored.  This  feature  is 
useful  for  playback  or  non-real  time  DCS  processing. 

B.  Triple- Buffered  System 

Figure  9 shows  a three- channeled  input  buffer  design  (System  2)  which 
utilizes  nine  16K  x 24  buffer  modules  for  triple-buffering.  By  the  addition 
of  individual  module  controllers  and  increased  parallelism  in  both  input  and 
output  data  steering.  System  2 not  only  provides  triple-buffered,  non-overlapped 
performance,  but  meets  the  requirements  for  overlapped  data  access,  as  we  will 
see  in  the  next  section. 

The  same  memory  speed  conventions  apply  to  System  2 as  for  System  1. 
Referring  to  Figure  9,  the  general  notion  of  employing  input  and  output  multi- 
plexers still  applies,  although  the  design  now  includes  an  auxiliary  channel. 

With  appropriate  control  to  avoid  contention  for  modules.  System  2 can  support 
simultaneous  traffic  from  two  input  or  output  devices.  The  multiplexers  shown 
in  Figure  9 should  be  considered  "virtual  multiplexers"  in  the  sense  that  data 
buses  could  be  used  to  minimize  the  hardware  size  of  such  a structure.  With 
respect  to  control,  individual  controllers  mean  increased  generality.  For  the 
non-overlapped,  triple-buffered  cases  to  be  discussed  here,  however,  it  is  useful 
to  consider  all  controllers  associated  with  levels  A,  B and  C to  be  slaved 
together  in  groups  of  3 modules. 
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four-rail  convolver  imposes  one  additional  requirement  on  the  structure  of 
Figure  9.  When  packing  three-channel  data,  one  DCS  input  rail  must  be  inhibited 
or  "zero  padded".  The  "ZP"  terminal  on  the  output  multiplexer  of  Figure  9 
exists  for  this  reason.  Figures  10  and  11  show  data  streams  and  module  activity 
for  three  typical  operating  modes,  including  a packed-data  case. 

Mode  1 is  the  collection  of  48K  data  points  on  a single  input  channel, 
and  the  subsequent  delivery  of  three  16K  transform  data  sets  to  the  convolver. 
Only  the  channel  1 A/D  is  active  and,  as  shown  in  Figure  11,  strict  non- 
overlapping of  data,  starting  with  an  empty  buffer  as  shown,  implies  that  no 
level  A data  may  be  offloaded  until  all  level  A writing  has  ceased.  Of  course, 
any  stored  data  in  levels  B or  C could  be  read  concurrently  with  the  filling  of 
level  A.  Note  that  readout  to  the  convolver  requires  half  the  data  acquisition 
time. 

Mode  2 is  the  first  of  two  three-channel  cases,  and  uses  module  con- 
figuration 2 to  format  six  8K  transform  data  sets.  Although  the  8K  sets  are 
ordered  sequentially  according  to  input  channel,  consecutive  8K  sets/channel  are 
easily  obtainable.  No  module  rail  switching  is  required. 

Mode  3 is  a three-channel,  packed  data  mode  which  supplies  four  16K 
transform  data  sets,  and  uses  the  zero  pad  feature  discussed  earlier. 
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(b)  MODE  2 

THREE  CHANNELS.  16K  POINTS/CHANNEL  IN;  SIX  8K  TRANSFORM  DATA  SETS  OUT 
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4K  -8K1 
4K'-8K  V 
4K '-8K  1" 
. ZEROES 


8K  -12K-1 
8K'-12K  r 
8K"-*12K-1" 

. ZEROES  . 


12K-16K  1 
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(c)  MODE  3 

THREE  CHANNELS,  16K  POINTS/CHANNEL  IN;  FOUR  PACKED  16K  TRANSFORM  DATA  SETS  OUT 


Fig.  10.  System  2 data  streams  for  3 typical  modes. 
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Fig.  11.  System  2 memory  module  activity  for  modes  1,  2 and  .3. 
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C.  Multi-Overlapped  System 

Beyond  the  multi-buffered  memory  modes  described  above,  there 
exists  an  application  wherein  the  IB  is  used  as  a continuous  data  recorder 
at  the  system  A/D  converter  rate,  while  simultaneously  providing  overlapped 
sets  of  the  output  data  for  all -range  processing,  in  contrast  to  designs  for 
non- overlapped  data  sets  discussed  earlier.  Two  alterations  in  System  2 of 
Figure  9 would  permit  all-range  processing.  First,  we  must  relax  the  some- 
what artificial  constraint  of  slaving  module  controllers,  and  allow  them  to 
run  independently  (though  in  many  instances  they  need  not  all  be  autonomous). 
Second,  we  must  either  build  the  buffer  memory  system  with  hardware  which 
will  support  single-channel  input  rates  in  the  vicinity  of  120  Ms/s,  or 
permit  the  DCS  to  stop  and  wait  when  IB  data  is  not  ready. 

The  selectable  overlap  feature  may  be  realized  by  storage  of  certain 
data  segments  in  more  than  one  of  the  three  modules  shown  in  Figure  12. 

Figure  13  shows  the  required  module  read  and  write  intervals  for  25%  overlap  and 
50%  overlap  cases.  Some  fundamental  issues  relating  to  equilibrium  speeds 
for  continuous  throughput,  amount  of  data  overlap  and  multiplicity  of  memory 
modules  require  discussion.  First,  let  us  assume  that  the  IB  output  data 
rate  (i.e.,  the  DCS  input  rate)  will  be  a constant  120  Ms/s,  which 
is  equivalently  30  Ms/s  on  each  of  4 rails.  From  Figure  13,  it  can  be  seen 
that  some  overlap  sizes  (>50%)  lead  to  efficient  utilization  of  all  modules 
fall  busy  all  the  time),  whereas  overlaps  of  less  than  50%  have  intervals 
during  which  no  activity  occurs.  Such  gaps  must  nonetheless  be  present  to 
maintain  appropriate  redundant  writing  and  continuous  output  streams. 
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It  should  also  be  apparent  that  overlapping  of  output  data  while  main- 
taining continuous  throughput  requires  that  the  input  be  slowed  by  some  factor 
related  to  the  amount  of  overlap.  We  define  the  amount  of  overlap  here  to  be 
the  number  of  like  data  points  stored  redundantly  in  any  two  modules.  As  the 
amount  of  overlap  increases,  the  input  speed  must  be  further  slowed.  Specific 
examples  of  data  formatted  for  4K,  8K  or  16K  transforms,  and  having  various 
amounts  of  overlap,  are  shown  in  Figure  14.  Data  storage  for  these  three  trans- 
form sizes  requires  that  individual  memory  modules  be  operated  in  configurations 
1,  2 and  3,  respectively.  Full  utilization  of  available  memory  storage  capacity 
is  required  when  storing  data  for  16K  transforms,  whereas  8K  and  4K  transforms 
require  50%  and  25%  capacity,  respectively. 

The  maximum  amount  of  overlap  attainable  is  a direct  function  of  the 
number  of  memory  modules  constituting  the  all  range  buffer  configuration.  While 
Figure  12  shows  a module  count  of  3,  which  is  the  minimum  number  required,  we 
could  employ  as  many  modules  as  are  within  the  system,  provided  that  performance 
would  improve  and  that  the  necessary  data  and  control  paths  existed.  For  any 
given  module,  let  the  following  definitions  apply: 

^W  = 

'Tr  = 

L 

N 


write  time  per  data  set 
read  time  per  data  set 

, . , , number  of  points  common  to  two  modules 

normalized  overlap  = f --7 — 

number  of  data  points  per  data  set 

number  of  buffer  modules 


24 


dk 


m 


INPUT  DATA  PATTERN 


OUTPUT  DATA  PATTERN 


18-2-13201 1 


0-4K  T4K-8K  1,8K-12K'1,  •••  0 -IK  1,2K^3K-1,4K-5K  1.6K--7K1,  ... 

1K-2K-1.3K-4K  1,5K-6K  1.7K-8K-1,  ... 

2K-3K-1,4K-*5K-1,6K-7K1,8K-9K  1,  ... 

3K-4K-1,5K-6K.1,7K-8K  1,9K-10K-1,  ... 

(a)  4K  TRANSFORMS,  50%  (2K>  OVERLAP 


INPUT  DATA  PATTERN  OUTPUT  DATA  PATTERN 

0-8K-1,8K-16K-1,16K-24K-1-..  0 -2K  1,5K  -7K  1,  10K-12K-1,15K-17K.1,  .. 

2K-4K-1.7K  -9K-1,  12K-14K-1,17K-19K1,  .. 
4K-6K-1.9K  -11K  1,14K-16K'1,19K-21K  1,  •• 
6K-8K-1,nK-13K-1,16K->18K.1,21K-23K  1,  •• 

(b)  8K  TRANSFORMS,  37’/2%  (3K)  OVERLAP 


INPUT  DATA  PATTERN 


OUTPUT  DATA  PATTERN 


a-16K-l,16K-32K  l,32K-48K.1,  ... 


0 -4K-1,  12K-16K-1,24K-28K-1,36K-40K-1, 

4K  -8K-1,  16K-20K1,28K-32K-1,40K-44K-1, 
8K  -12K-1.20K-24K  1,32K-36K-1.44K-48K1, 
12K-16K-1,24K-28K-1,36K-40K-1,48K-52K1, 


(c)  16K  TRANSFORMS,  25%  (4K)  OVERLAP 

Fig.  14.  Representative  data  patterns  for  a 3-module,  single-input, 
all  range  buffer  configuration. 


25 


A simple  examination  of  several  timing  cases  such  as  those  shown  in 
Figure  13  indicates  the  following: 


= 1 / (N-1)  at  maximum  overlap 


L = CN-2)  / (N-1) 
max  j I y ) 


Equation  (1)  indicates  that  module  read  and  write  time  intervals  are 
directly  related  to  the  amount  of  overlap,  and  that  smaller  overlaps  permit 
memory  write  speeds  to  approach  read  speeds.  A limit  is  set  on  writing  speed 
under  maximum  overlap  conditions  as  shown  in  Equation  (2)  based  on  the  number 
of  modules  in  use.  Equation  (3)  stems  from  (1)  and  (2),  and  relates  maximum 
overlap  to  module  count.  Figure  15  summarizes  the  above  properties,  and  shows 
that  extending  the  module  count  beyond  3 for  any  transform  size  extends  the 
amount  of  achievable  overlap,  but  with  commensurate  slowing  of  the  input  data 


rate. 


From  a radar  standpoint,  the  transform  size  and  time-bandwidth  product 


determine  the  required  amount  of  data  overlap.  Fortunately,  from  a hardware 
standpoint,  one  would  ordinarily  expect  that  all-range  processing  entails  the 
collection  of  data  sets  having  a size  much  larger  than  their  associated  TW 
product,  meaning  that  the  normalized  overlap  would  be  small.  This  fits  nicely 
into  the  scenario  of  Figure  15,  where  small  overlaps  permit  a small  module 
count  (usually  3),  and  the  smallest  input  rate  decrease. 
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D.  Parity 

Having  provided  an  overview  of  the  basic  buffer  module  and  the  ways 
it  can  be  assembled  into  complete  multi-channeled  buffer  memories,  we  now 
consider  the  issues  of  parity  and  command/control  formats  prior  to  detailed 
consideration  of  buffer  module  structure  and  operation. 

Parity  check  capability  for  semiconductor  memory  modules  of  the  size 
and  speed  considered  here  is  recommended  practice.  In  general,  this  involves 
the  appending  of  an  extra  redundant  binary  bit  onto  each  incoming  data  word  to 
form  an  augmented  word  whose  total  number  of  I's  is  always  even  or  odd.  Odd 
parity  is  preferred  for  two  reasons.  First,  the  zero  bit  usually  represents  the 
absence  of  a signal;  hence,  the  use  of  odd  parity  ensures  that  some  signal  will 
be  obtained,  and  second,  for  even  word  lengths,  odd  parity  will  detect  the 
situation  where  all  I's  or  O's  have  occurred  due  to  malfunction. 

It  is  important  to  note  also  that  although  a simple  parity  check  will 
detect  a single  error  in  a group  of  digits  of  any  size,  the  probability  of  a 
double  error  goes  up  as  the  length  of  the  checked  group  increases.  Thus, 
whether  we  check  our  module  words  en  masse  or  test  smaller  groups  of  bits,  some 
optimum  span  of  bits  must  be  chosen  which  is  consistent  with  an  allowable 
error  limit. 

More  specifically,  we  have  chosen  to  size  the  IB  designs  presented 
here  so  that  11-bit  com])lex  data  (11  hits  real,  11  bits  imaginaryl 
could  be  accommodated,  with  1 parity  bit  to  be  allocated  for  each  11-bit  byte. 
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Both  Figures  6 and  9 illustrate  the  positioning  of  parity  generation  and  parity  1 

I 

checking  logic.  Note  that  the  parity  logic  spans  not  only  the  storage  modules,  | 

\ 

but  the  input  and  output  multiplexers  as  well.  | 

i 

j 

i 

E.  Command  Formats  | 

The  command/control  issues  For  a buffer  memory  system,  such  as  i 

System  2 (a  .S-channcl,  triple-buffered  system),  shown  in  Figure  9,  are  intro- 
duced here  in  a general  way.  Analysis  in  depth  of  the  requisite  IB  control  states 
is  an  appropriate  part  of  a detailed  design,  but  beyond  the  scope  of  this  report. 

Control  of  the  complete  System  2 can  be  described  in  terms  of  a 
central  command  buffer  which  passes  arguments  to  nine  individual  module  con- 
trollers, each  capable  of  asynchronous  read  or  write  operations  with  respect  to 
other  modules,  and  which  retains  other  parameters  associated  with  logic  mutually 
shared  by  all  modules  (e.g.,  the  steering  of  input  and  output  devices). 

Three  kinds  of  command/control  signals  are  envisioned  here,  and  are 
termed  1)  initialization  levels,  2)  flags  and  3)  triggers.  Initialization 
levels  are  supplied  to  the  IB  prior  to  the  beginning  of  memory  data  transfer. 

Triggers  and  flags,  on  the  other  hand,  are  initiators  and  indicators  of  events 
occurring  during  IB  operation.  A list  of  possible  command/control  signals  is 
given  below  in  Table  1,  together  with  brief  references  to  their  function. 

Figure  9 shows  conceptually  the  interconnection  paths  among  theA/l)s, 

IB,  DCS  and  auxiliary  devices.  This  arrangement  permits  easy  module  reassign- 
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TABLE  1 


COMMAND/CONTROL  SIGNALS  FOR  A 
3'CHANNEL  TRIPLE-BUFFERED  MEMORY  SYSTEM 


INITIALIZATION  LEVI-ILS  » BITS 

Buffer  Channel  Select  2 

Buffer  Level  Select  2 

Input  Device  Select  2 

Transform  Size  2 

Output  Device  Select  2 

Pack  Select  ) 

Data  Suppress,  Start  Pt , |:nd  Pt  (1,4,4) 


Ucad /Write  1 

Start  Pt,  lind  Pt  , n Data  Sets  (4,4,2) 

[■LA(LS  « BITS 

IB  Output  Ready  1 

IB  Ininit  Ready  1 

TRICCLRS  » BITS 

IB  Output  Request  1 

(a)  DCS  Input  Request 

(b)  Aux  Input  Request 


IB  Input  Request 

(a)  A/I)  Start 

(b)  Aux  Output  Request 


1 


DESCRIPTION 

Selects  active  IB  module(s) 
across  all  channels 

Selects  active  IB  module(s) 
by  channel 

Selects  A/D  or  Auxiliary  inputs 

4K,  8K,  16K  information  used 
chiefly  for  A/D  and  DCS  trans- 
fers 

Selects  DCS  or  Auxiliary  outputs 

Refers  to  IB  outputs  only, 
alters  module  output  steering; 
doesn't  change  output  device 

Used  to  blank  fourth  DCS  input 
channel  when  packing,  address 
to  nearest  IK,  for  IB  reads 
only 

Programs  memory  module(s)  to 
read  or  write  when  triggered 

Specifies  start  and  end  addresses 
to  within  nearest  IKth  inter- 
val of  total  module  address  space 

DESCRIPTION 

Indicates  to  output  device  that 
IB  is  ready  to  deliver  data, 
but  module  isn't  necessarily  full 

Indicates  to  input  device  that 
IB  is  ready  to  accept  data,  hut 
module  isn't  necessarily  empty 

DESCRIPTION 


Requests  IB+DCS  data  transfer 

Requests  IB»Aux  data  transfer 
(Data  reformat  required) 


A/I>IB  transfers  if  IB  ready; 
else  command  is  aborted 

When  IB  is  ready,  Aux  Output 
Req  begins  Aux  device+IB  trans- 
fer (data  reformat  required) 
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merit,  and  supports  up  to  two  simultaneous  asynchronous  reads  or  writes  per 
channel.  Recall  that  the  31  bits  of  initialization  previously  listed  are  to 
be  relayed  to  all  9 modules  in  turn.  Thus,  for  concatenating  operations  on 
successive  data  sets,  additional  initialization  data  must  first  be  routed  through 
the  central  command  buffer  and  then  be  queued  in  fast  stacks  within  the 
appropriate  module  controllers.  This  permits  real-time  reconfiguration  at  the 
module  level. 

For  most  of  the  control  states  outlined  here,  the  Signal  Processor 
Control  (see  Figure  2),  which  is  well  up  in  the  control  hierarchy,  mediates 
any  contention  among  devices  vying  for  IB  access.  If  we  say  that  the 
A/D  to  IB  to  DCS  path  constitutes  the  primary  data  channel,  then  the  Signal 
Processor  Control  will  decide  when  auxiliary  devices  may  have  access  to  the 


IV. 


DETAILED  MEMORY  MODULE  DESCRIPTION 


A 1-bit  slice  of  the  basic  buffer  module,  described  in  Section  II, 
is  shown  in  Figure  16  below.  Recalling  that  either  one,  two  or  four  input 
channels  may  be  active,  the  front  end  functions  as  a data  rate  reducer  prior 
to  depositing  data  within  the  memory  array  via  the  input  steering  section. 
Alterable  input  and  output  data  paths  permit  stored  data  to  be  retrieved  from 
the  memory  array  in  several  different  ways.  Independent  addressing  and 
control  states  are  required  for  modules  being  written  while  others  are  being 
read;  i.e.,  independent  controllers  are  required  for  asynchronous  operation. 

For  the  large  multi-buffered  structures  described  earlier,  address/control 
logic  can  be  shared  among  those  modules  which  are  either  read  or  written  at  the 
same  time. 

Some  conventions  are  necessary  with  respect  to  input  data  rates  as  a 
function  of  the  number  of  active  input  channels.  Let  us  define  f ^ and  f^ 
as  the  input  sample  (word)  rates  for  one,  two  and  four  channel  operation, 
respectively.  Based  upon  hardware  experimentation  to  be  discussed  later,  the 
expected  values  of  f^,  f^  and  f^  are  190  Ms/s,  95  Ms/s  and  47  megasamples/sec. 

One  and  two  channel  rates  in  excess  of  60  Ms/s  permit  the  buffer 
system  to  keep  pace  with  a planned  60  Ms/s  A/D  converter.  The  four-channel  con- 
figuration of  the  basic  module  would  constrain  the  input  rate  to  47  Ms/s  per 
channel.  However,  any  realistic  system  would  use  a combination  of  four  to 
eight  modules,  each  with  only  one  (or  at  most  two)  active  inputs,  thus 
accommodating  input  rates  of  at  least  95  Ms/s.  The  experimentally  anticipated 
rate  of  f^  = 190  Ms/s  is  clearl)-  the  most  stressing,  and,  to  the  extent  that  the 
proposed  buffer  hardware  can  support  it,  future  A/D  converter  rates  up  to  190  Ms/s 
could  be  accommodated.  The  confidence  level  in  values  for  f,„^^  will  be  discussed 
further  in  connection  with  experimental  hardware  results. 
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A. 


Front  Knd 


y 


The  IB  front  end  is  simply  a scrial-to-paral Icl  converter  followed 
by  a holding  register  as  shown  in  Figure  17.  Data  multiplexers  provide  a 
choice  between  serial  or  parallel  operation,  along  with  serial  data  entry  point 
selection. 

For  single-rail  inputs  (module  configurations  1,  2,  3 of  Section  IIJ, 
the  front  end  is  configured  as  a single  8:1  serial-to-parallel  converter  with 
output  register,  and  delivers  8-bit  words  at  f^/8  megawords/sec.  to  the  input 
data  steering  section.  Input  data  rates  of  fj^>60  Ms/s  represent  the  most 
stressful  case  in  terms  of  hardware  speed,  and  will  be  discussed  in  detail. 

For  the  moment,  then,  let  us  consider  Figure  17  to  be  a conceptual  model  only. 
Figure  18  is  a front  end  timing  diagram  for  configurations  1,  2 and  3,  wherein 
the  converter  clock  (SCLK)  has  a periodicity  of  1/fj  ns.  All  data  are  numbered 
according  to  order  of  occurrence.  As  indicated  by  the  diagram,  serial  data 
are  transferred  upward  in  the  converter  at  the  SCLK  rate;  then  transferred  to 

the  output  register  under  parallel  clock  (PCI.KJ  control. 

Two  input  rails  are  used  for  configurations  1 and  5,  and  accordingl}' 
the  front  end  is  arranged  as  a pair  of  4:1  scri al -to-paral lei  converters  with 
. ; parallel  output  registers.  On  a pcr-rail  basis,  the  input  data  rate  is  now 

I 

j reduced  to  f^  Nlwords/sec.  ; i.e.,  SCLK  has  a periodicity  of  1/f.,  ns.  However, 

■ ‘ the  net  front  end  data  throughput  rate  becomes  (2  channels)  x (f.,  Mwords/sec/ 

► channel)  = 2f.,  Mwords/sec.  Figure  19  shows  the  timing  relationship  for  con- 

I 

' , figurations  4,  5 and  6.  Note  that  all  data  arc  labelled  to  denote  their  sc- 

'.  I ^ 

f ■ cjuential  nature  as  well  as  input  channel  rail  of  origin, 

t • 
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For  configurations  7 through  9,  the  front  end  must  accept  data  on 
4 input  rails,  each  operating  at  a rate  of  f^  words/sec.  Four  2:1  serial-to- 
parallel  converters  perform  ttie  required  data  rate  reduction  prior  to  being 
strobed  into  the  output  register,  which  is  updated  every  2/f^  nsec.  Figure  20 
depicts  the  front  end  timing  for  configurations  7,  8 and  9. 

B.  Input  Data  Steering 

The  input  data  steering  section  is  simply  a set  of  multiplexers  which 
are  interposed  between  the  front  end  data  register  and  the  memory  array.  IvTiile 
several  mode-dependent  steering  options  exist,  all  are  consistent  with  left 
bank/right  bank,  partial  left  bank/right  bank,  or  upper  bank/lower  bank  data 
routing  strategies.  Figure  21  denotes  the  available  data  paths  between  the 
front  end  and  the  storage  array.  Control  word  C7...C2  determines  the  steering 
format.  As  shown  in  Figure  22(a)  and  (b) , buffer  configurations  1,  2,  3,  5 
and  6 require  dynamic  control  for  alternate  column  interchanging  within  either 
the  left  or  right  bank  of  the  memory  array.  Configurations  8 and  9 do  not 
require  the  transposing  of  upper  and  lower  bank  rows,  but  do  require  the  d>'namic 
end-for-end  swapping  of  data  within  each  row  of  the  memory  array  as  shown  in 
Figures  22(d)  and  (e).  Configuration  4 utilizes  static  data  steering 
(Figure  22(c)),  using  what  might  he  termed  an  interleaved  left  bank/right  bank 
format.  It  should  be  noted  that  bank  selection  per  se  is  a function  of  memory 
chip  select  logic.  For  example,  while  RAM  0 and  RAM  8 may  simultaneously 
receive  data  word  from  the  front  end  (Figure  22(a)),  only  one  of  the  two 
memory  elements  will  be  written  at  any  given  time. 
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Fig.  22.  Storage  array  input  data  connections. 
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C.  Memory  Array 

The  memory  array  is  simply  a collection  of  IK  x 1 bipolar  random-access 
integrated  circuits,  organized  on  a per-bit  basis  as  a 4 x 4 device  array.  As 
shown  in  Figure  22,  data  access  for  writing  can  be  either  on  a left  bank/right 
bank  or  upper  bank/ lower  bank  basis.  When  reading  the  array,  access  is  made  to 
groupings  of  four  devices  as  indicated  in  Figure  23.  A thorough  examination  of 
the  necessary  memory  array  control  states  (e.g.,  chip  select,  page  select, 
page  address,  write  enable")  for  the  9 buffer  module  configurations  will  be  made 
in  a later  section. 
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D.  Output  Data  Steering 

The  output  data  steering  section  has  been  designed  with  particular 
emphasis  upon  hardware  simplicity  and  speed.  Figure  23  denotes  the  grouping 
of  semiconductor  memory  devices  such  that  all  necessary  intergroup  connections 
may  be  realized  using  4:1  multiplexers.  For  all  configurations,  the  positioning 
of  data  within  the  basic  16-element  memory  array  places  the  burden  of  hardware 
complexity  upon  the  input  section,  the  payoff  being  that  pipelined  addressing 
may  be  used  across  each  of  the  4-Ry\M  groups  of  Figure  23  to  enhance  buffer  memory 
output  speed.  As  will  be  later  described,  the  wire-ORing  of  memory  devices 
simplifies  the  explicit  data  steering  problem,  but  requires  that  individual 
RAM  contributions  to  group  buses  through  N^)  be  mutually  exclusive. 

Pipelined  addressing  in  itself  implies  the  existence  of  an  address  delay 
register  stack,  together  with  non-overlapping  chip  selects. 


i\ 


i 

i 
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E. 


Address/Control 


A synchronous  binary  counter  forms  the  basis  for  deriving  all  address 
and  control  states.  Figure  24  is  a simplified  block  diagram  which  introduces 
the  write/read  command  (W/R) , and  a 4-bit  binary  coded  decimal  (BCD)  con- 
figuration select  word,  M.  For  writing  or  reading,  some  configurations  require 
different  counter  speeds  than  others.  Memory  addressing  is  shown  in  terms  of 
chip  select,  page  select  and  page  address  sections.  For  writing,  chip  select 
performs  a bank  select  function,  and  enables  the  write  pulse  at  appropriate 
RAMS.  When  reading,  non-overlapping  groups  (4)  of  chip  selects  make  fast 
memory  output  rates  possible  via  address  pipelining.  For  all  configurations 
except  3,  each  RAM  is  envisioned  as  having  four  pages  (mode  3 utilizes  eight 
"half  pages").  The  strategy  for  choosing  both  write  and  read  paging  algorithms 
is  based  upon  the  need  ultimately  to  read  out  cont iguous  data  from  adjacent 
RAMS  sequentially,  while  avoiding  memorx-  contention.  As  shown  in  Figure  24, 
read  addresses  are  offset  in  time  using  a register  stack  (i.e.,  pipelined)  before 
being  applied  to  the  memory  array.  On-page  addressing  is  binary  sequential. 
Tables  2 and  3 below  summarize  the  sjiecific  counter  states  for  each  con- 
figuration .* 


* This  report  does  not  jnirport  to  describe  every  nuance  o'  the  control  iirohlcm. 
Issues  such  as  optimal  phasing  of  data  arriv.il  for  reliable  input  registering, 
or  offset  delays  to  correlate  output  steering  control  with  d.ita  delayed  due  to 
RAM  access  times  should  be  addressed  by  the  final  hardware  designers.  The 
availability  of  counter  states  and  static  mode-select  -tates  should  make  the 
generation  of  such  status  flags  as  "full".  "1'2  full",  etc.  very 

St ra i ght  forwa  rd . 
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TABLE  3 

BUFFER  MODULE  CONTROL  STATES  FOR  READING 

* ADDRESS  STATES  F’RIOR  TO  STACK  DELAYS 
**  CONFIGURATION  3 HAS  UNIQUE  8 -PAGE  ORGANIZATION 
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F.  Buffer  Module  Configurations 

This  section  completes  the  discussion  of  the  basic  buffer  module 
structure  by  describing  the  nature  of  the  data  flow  for  all  9 configurations 
with  the  aid  of  memory  load  maps  as  summarized  in  Section  II.  Configuration  1 
is  described  in  extra  detail  since  it  is  representative  of  the  configurations 
used  within  the  systems  described  in  Section  III.  This  discussion  includes 
data  stream  and  timing  diagrams  which  are  representative  of  the  control  and 

data  activity  found  in  other  configurations. 

Configuration  1 

Configuration  1 requires  that  the  buffer  memory  front  end  operate  at 
the  single-rail  input  data  rate  (f^=l/T^),  and  supply  an  8-bit  datum  to  the 
input  steering  section  every  8T^  nanoseconds.  Figure  25(a)  describes  the  front 
end  data  flow.  The  entire  left  bank  (RAMS  0 through  7)  is  loaded  before  filling 
the  right  bank.  As  shown  in  Figure  25(b),  an  input  steering  change  is  required 
every  4096  T^  ns.  Address  space  for  the  16  RAMS  which  comprise  each  1-bit  memory 
slice  is  divided  into  four  pages.  Figure  25(c)  relates  page  selection  during 
writing  to  input  data  activity.  Within  a given  page,  addressing  is  always  binary 
sequential.  Basic  timing  relationships  for  memory  write  activity  are  shown  in 
Figure  26.  The  time  interval  shown  is  that  for  the  first  two  write  operations 
(i.e.,  the  loading  of  the  first  16  serial  data  points)  within  the  left  bank, 
and  presupposes  that  the  first  front  end  S/P  conversion  has  preceded  the  first 
write  pulse. 

Figure  27  is  the  memory  load  map  for  configuration  1 and  describes  tlie 
way  in  which  16, .584  sequential  data  points  (words)  reside  in  memory.  The  memory 
map  notion  becomes  especially  useful  when  examining  the  addressing  and  control 
needed  for  reading. 
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ivluMi  reading  the  memory  in  configuration  1,  tlic  address/control  counter 
is  slowed  liy  a factor  of  four  and  a dola\-  register  stack  is  introduced  to  ])ei'- 
init  address  ])ipelining  foi-  im]iroved  speed.  While  jnige  selection  clianges  after 
every  2048  data  points  when  writing,  read  operations  require  jnige  switching  on 
alternate  RAM  grciqi  accesses  as  shown  in  figure  28  1a).  'I'hese  same  groups  of 
4 R.\M.S  (the  four  RAM  columns  of  IMgure  2?)  receive  the  above-mentioned  stack- 
delayed  addresses  and  provide  group  access  rates  mucii  higher  than  those  ]iossihle 
using  non-pipel  ined  individual  R/V^IS.  [Mgure  2S(I))  indicates  tliat  while  data 
points  0-s.^  are  being  read  t'rom  RAM  group  (HA,  data  jtoints  4k->-4K+,A  are  lieing  read 
from  R.AM  group  4s-7,  etc.  Ihe  timing  diagram  of  I'igure  29  sliows  tlie  rclat  ionsliip 
between  the  delayed  addresses  used  within  a given  RA.^1  group  and  the  corresponding 
family  of  non-overlapping  ciiip  select  pulses.  The  "data  valid"  window,  while 
somewhat  overs imjil i fi ed , nevertheless  convevs  the  idea  that  each  RAM  within  a 
group  contrilnites  to  the  memory  output  2.S°s  of  the  time.  Strictly  speaking,  this 
window  is  narrowed  due  to  the  cliip  select  access  time  re(|ui  rement . 

figure  29  also  indicates  that  the  addressing  within  a ]iage  is  altered 
at  half  the  rate  at  which  page  selection  changes.  Consider,  for  example,  the 
first  two  accesses  of  RAM  0:  I'rom  Figures  1 and  2"’.  it  is  evident  that  RAM  0 

will  supjily  data  ]ioint  0 for  rail  1)^  when  first  accessed,  and  then  will  proidde 
data  point  4k+4  on  output  rail  wlien  accessed  a second  time.  But  data  points 

0 and  IKi-4  are  the  first  entries  on  pages  0 and  2 of  RAM  0.  hence  the  on-]iage 
address  needn't  change  when  page  selection  changes  from  0 to  2. 
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Fig.  28.  Configuration  1 data  streams,  reading. 
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* the  shaded  region.'^  of  the  address  waveforms  denote 
RAM  "data  valid"  windows. 


Figures  28(b)  and  (c)  describe  the  result  of  output  data  steering. 

In  configuration  1,  the  outputs  of  RAM  groups  0>.^  and  4-*-7,  together  with  8-+1 1 
and  12->-15  are  interchanged  on  alternate  group  accesses  to  form  streams  of  con- 

t t f 

tiguous  data  on  output  channel  through 

Configurations  2,5 

The  differences  between  configurations  1 and  2 are  summarized  in 
Tables  2 and  3 and  occur  primarily  in  the  areas  of  input  steering,  chip  selection 
and  page  selection.  The  load  map  for  configuration  2 is  shown  in  Figure  30. 
Addressing  and  control  has  been  defined  to  provide  proper  output  formatting 
for  two  consecutive  8K  transforms. 

Configuration  3 requires  the  same  high-speed  front  end  performance  as 
configurations  1 and  2 above,  but  differs  in  terms  of  input  steering,  chip 
selection  and  page  selection.  Unlike  all  other  configurations,  number  3 also 
utilizes  an  8-page  address  organization  (Figure  31);  a consequence  of  requiring 
output  data  sets  having  as  few  as  4K  points.  If  desired,  three  additional  4K 
sets  may  be  stored  with  no  modification  of  addressing  and  control.  In  such 
cases,  the  availability  of  status  flags  such  as  "1/4  full,"  "1/2  full,"  etc., 
is  highly  desirable. 

Configurations  4,5,6 

Configuration  4 is  the  first  of  three  requiring  two-channel  inputs  and 
a front  end  configured  as  a pair  of  4:1  S/P  converters.  As  noted  in  Table  3, 
the  address/control  counter  is  now  scaled  by  a factor  of  2 (double  the  speed  of 
configurations  1>3)  when  reading.  Incoming  8K  data  sets  on  channels  D1  and 
UI'  are  packed  such  that  a single  16K  transform  may  he  performed  (sec  Figure  4). 
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Fig.  31,  Configuration  3 memory  load  map 
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Unlike  configurations  5 and  6,  configuration  4 employs  static  input  and  output 
data  steering,  using  RAMS  0-*-3  for  storage  associated  with  output  rail  D^, 

RAMS  4->-7  for  etc.  and  only  8 RAMS  are  simultaneously  written.  The  memory 

load  map  for  configuration  4 is  shown  in  Figure  52. 

Module  configuration  5 permits  consecutive  8K  transforms  to  be 
performed  on  two  data  sets  which  were  stored  simultaneously.  Unlike  configura- 
tion 4,  dynamic  data  steering  is  needed  to  gain  access  to  2K-point  data  blocks 
when  reading  the  buffer.  Figure  35  shows  the  configuration  5 memory  load  map. 

Configuration  6 operation  is  similar  to  that  of  configuration  .3,  e.xcept 
that  input  and  output  steering  must  switch  at  a higher  rate  to  deposit  IK-point 
data  segments  in  memory.  Figure  34  shows  how  each  IK-point  segment  is  dis- 
tributed across  4 RAMS  of  a group  (column)  for  eventual  sequential  read  out. 

Configurations  7,8,9 

Module  configuration  7 is  the  first  of  three  remaining  configurations 
wherein  4-channel  input  data  and  4-channel  output  data  are  transferred  at  rates 
no  less  than  the  30Ms/s  convolver  rate.  The  front  end  for  this  class  of  con- 
figurations is  arranged  as  four  2:1  serial-to-paral lei  converters,  and  data  is 
alternately  deposited  in  the  upper  and  lower  banks  of  the  memory  array,  unlike 
the  left  bank-right  bank  approach  of  configurations  1 through  6.  This  require- 
ment arises  because  data  associated  with  all  four  output  rails  must  be  located 
such  that  the  4-RAM  group  pipelined-read  approach  used  in  all  other  configurations 
may  be  retained.  As  indicated  earlier,  we  trade  some  additional  input  circuit 
complexity  for  output  circuit  simplicity.  That  complexit>-  resides  within  the 
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input  steering  section  of  Figure  21,  which  otherwise  would  use  2:1  multiplexers 
exclusively.  Figure  35  is  the  memory  load  map  for  configuration  7,  and  Tables 
2 and  3 delineate  the  control  states.  Configuration  7 is  the  only  one  in  which 
incoming  data  on  a given  rail  will  ultimately  be  read  out  on  the  corresponding 
output  rail. 

Configurations  8 and  9 permit  an  individual  buffer  module  to  accept 
4K  data  sets  on  each  of  four  input  channels,  and  to  format  the  data  upon 
readout  to  permit  either  two  8K  transforms  or  four  4K  transforms,  respectively. 
These  configurations  store  data  in  exactly  the  same  way  (Figure  36)  and  differ 
only  slightly  in  their  control  states  for  reading,  as  shown  in  Table  3. 


r>2 


1K*2.  1K*6. 

2K-2 

Trr^  ID’, 
5K*2Mk*5' 

lK-2' 

4K-2" 

W»l"\  2K*E 

'' 

3K-2" 

2K*2.  2K*6. 
'5K^ 

3K-2 

, ....  4K  2' 

2',  6',  10' 

, lK-2' 

1K*2’",  lKt6"‘ 1 

2K-2"'j 

V. 


HARDWARE 


Having  outlined  the  essentials  of  the  basic  buffer  module,  it  remains 
to  determine  the  performance  limits  (principally  speed)  using  available, 
off-the-shelf  components.  The  following  discussion  on  hardware  examines  three 
high  performance  subsections  of  the  buffer  module. 

Worst-case  analysis  and  experimental  results  are  included  for  these 
selected  portions  assuming  an  all-ECL  memory  realization.  The  hardware  dis- 
cussion concentrates  on  those  subsystems  within  which  there  exists  most  techno- 
logical risk  (speed  uncertainty);  namely,  the  front  end,  memory  array  and  output 
steering. 

A.  Front  End 

This  section  describes  a baseline  ECL  lOK  front  end  whose  input 
speed  can  keep  pace  with  the  A/D  convcicers  to  be  used  in  the  signal  processor 
system.  An  alternative  ECL  lOOK  design  permitting  higher  IB  input  speeds 
is  also  discussed. 

The  ECL  lOK  front  end  logic  diagram  is  shown  in  Figure  ^7.  Unlike 
the  conceptual  diagram  of  Figure  17,  wc  now  include  specific  devices  by  generic 
part  designation.  The  front  end  is  a programmable  ser i al-to-paral lei  con- 
verter, followed  by  a parallel  holding  register,  which  accepts  differential 
inputs  from  the  A/D  converters  associated  with  up  to  four  radar  cliannels.  Tlie 
ECL  10000  4-bit  shift  register  forms  the  basis  for  the  design.  For  either 
one  or  two-channel  motles,  each  4 bit  regisi'er  '■c  epts  serial  inputs  via  lOl^.S 
multiplexers,  but  when  four  channels  arc  active,  the  parallel  entry  feature  is 


Potential  front  end  speed  is  directly  related  to  the  timing  require- 
ments of  the  10000  and  10176  registers.  In  the  steady  state,  the  maximum 
data  transfer  rate  is  a function  of  register  set-up  and  hold  times,  plus 
allowances  for  signal  degradation  along  with  clock  and  control  skews.  Signal 
degradation  refers  to  less- than- ideal  (but  acceptable)  waveform  quality  such 
as  attenuation  or  slowed  edge  speeds  resulting  from  operation  at  high  rates 
with  moderate  fan-outs.  A clock  skew  allowance  is  made  in  recognition  of  the 
uncertainty  of  precise  clock  arrival  times  throughout  a system  of  moderate  sice. 

The  calculation  of  expected  speed  in  logic  systems  is  strongly  in- 
fluenced by  the  level  of  designer  conservatism  conditioned  by  prior  experience. 

In  the  timing  budget  tallied  below,  front  end  speed  capability  is  conserva- 
tively rated  while  meeting  minimum  peit'ormance  requirements  with  high  confidence. 

TABLE  4 

DELAY  TOTALS  FOR  SPEED-CONSTRAINING  FRONT  END  PATHS 


HIOOO 

t iming 

U)  1 7^ 

t iming 

Defini t ions 

SO 

= 1.5ns 

■’■si) 

= 2.5  ns 

T^jj  = register  input 

data  set-up  time 
prior  to  clock 

SKEW 

= 2.5 

'^SKEW 

= 2.5 

T\.,.^,.  = time  allowance 
SKEVS  - , 

for  skew 

DEG 

= 2.0 

* di;g 

= 2.0 

T _ = time  allowance 
Did'  vw-aveform 

degradation 

H 

= 1.0 

■>H 

= 1 .0 

ijj  = register  input 
data  hold  time 
after  clock 

7.0  ns 

8.5  ns 
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From  Table  4,  10176  set-up  and  hold  times  limit  the  expected  front 
end  data  rate  to  ~117  Ms/s. 

A 1-bit  microstrip-interconnected  front  end  has  been  built  and  tested. 
Several  combinations  of  10000  and  10176  devices  were  operated  while  maintaining 
the  timing  budgets  of  Table  4.  Observed  input  data  rates  were  not  less  than 
125  Ms/s. 

It  should  be  noted  that  the  10176  could  certainly  be  replaced  by  the 
faster  but  less  dense  10000  part,  which  would  increase  expected  front  end  speed 
to  142  Ms/s.  However,  additional  device  and  packaging  expense  would  be  incurred. 

The  IB  can  have  a single-channel  input  rate  in  excess  of 
142  MHz  by  using  an  ECL  lOOK  front  end  such  as  the  one  shown  in  Figure  38. 

Using  delay  budget  assumptions  similar  to  those  of  Table  4 but  for  100141 
devices,  the  ECL  lOOK  front  end  should  accommodate  input  rates  in  the 
250-300  Ms/s  range.  These  rates  exceed  the  capability  of  the  lOK  memory  array 
described  in  Section  IV. 

B.  Memory  Array 

This  design  assumes  the  use  of  the  ECL  10415A  1024  by  1-bit  IC.  When 

reading  the  buffer,  the  10415A's  worst-case  chip  select  access  time  of  10  ns 
is  of  particular  significance.  Timing  relationships  for  writing  into  the 
buffer  are  shown  in  Figure  39.  In  the  steady  state,  memory  write  cycle  time 
is  determined  principally  by  address  set-up  time,  write  pulse  width,  chip 
select  hold  time,  and  an  allci-^nce  for  signal  condition  and  skew.  Budgeting 
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Fig.  39.  ECL  1041.SA  RAM  write  timing  diagram. 


4 ns  for  the  latter  and  using  worst-case  10415A  parameters,  the  basic  memory 
write  cycle  time  becomes  42  ns,  which  translates  into  maximum  attainable  one, 
two  and  four-channel  input  rates  of  190,  95  and  47  Ms/s,  respectively. 

figure  40  compares  the  memory  array  writing  speed  as  a function  of  the 
number  of  input  channels  with  the  performance  obtainable  using  any  of  three 
front  ends.  Points  1 and  2 represent  the  4-channel  and  2-channel  input  cases, 
where  speed  is  constrained  by  memory,  and  hence  any  front  end  will  suffice. 

For  single  channel  operation,  the  ECL  lOK  front  ends  limit  input  rates  at 
117  and  142  Ms/s,  whereas  at  190  Ms/s,  the  lOOK  front  end  would  be  required. 

In  this  case,  the  performance  limit  would  be  determined  by  the  memory  array. 


C.  Output  Steering 
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Pipelined  addressing  is  employed  on  4-R.4M  groups  to  achieve  buffer 
read  speeds  higher  than  those  possible  using  conventional  methods.  Figure  41 
depicts  a 4-RAM  group  featuring  4-phase  addressing,  non-overlapping  chip 
, selects  and  wire-OR'd  outputs.  Basically,  given  the  sequential  nature  of 

stored  data,  we  take  advantage  of  the  relatively  fast  10415A  chip  select 
access  time  (compared  to  address  access)  and  commutate  from  RAM  A to  RAM  D. 

j ' The  4:1  data  multiplexers  of  Figure  23  have  been  supplanted  by  9:1  multiplexers, 

« 

as  would  be  needed  in  System  2.  The  somewhat  irregular  9:1  mux  can  be  built 

I . 

using  a 10164  8:1  multiplexer  with  a simple  AND  gate,  and  its  delays  arc  those 
of  the  10164.  Figure  42  shows  the  relationship  between  RAM  A and  RAM  B signals, 
i which  also  exists  between  RAMS  B and  C,  ('  and  D,  and  D and  A 
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The  delay  budget  for  determining  worst- case  buffer  access  time  is 
given  in  Table  5 below,  and  includes  the  usual  device  and  interconnection  delay 
allowances.  An  additional  1 ns  penalty  is  assessed  for  increased  propagation 
delay  when  wire-ORing  RAM  outputs.  A relatively  narrow  time  window  exists 
for  which  data  appearing  at  the  10164  output  is  valid  and  may  be  registered 
by  the  10176.  Based  on  worst-case  calculations,  the  expected  buffer  memory 
read  rate  is  42.5  Ms/s. 


TABLE  5 

DELAY  TOTALS  FOR  SPEED-CONSTRAINING  OUTPUT  STEERING  PATH* 


10415A 

T 

ACS 

10 

^WTRE-OR 

1 

10164 

T 

PD 

4.5 

10176 

T 

SU 

2.5 

^SKEW 

2 

^PATH 

2 

10176  Tj^  1.5 

25.5  ns 

An  experimental  1-bit  slice  of  memory  and  output  steering  logic  has 
been  built  and  tested.  Maintaining  worst-case  timing  margins  wherever  pos- 
sible, achievable  data  rates  were  typically  45  Ms/s. 

*10415A  TM  . = TRCS  . = 0 assumed 

min  min 


A preliminary  sizing  of  System  2 (the  9-module  system)  has  been  done 
to  gain  some  notion  of  total  circuit  count,  the  physical  volume,  and  power 
dissipation.  The  results  are  summarized  in  Table  6. 

TABLE  6 

PRELIMINARY  SIZING  OF  SYSTEM  2 


# 


ITEM 

IC's 

1 module 

1009 

9 modules 

9081 

parity,  selectors 

646 

IB  totals 

9727 

Assumptions : 

• standard  backplane/circuit  card 

•forced  air  cooling 

•high  efficiency  switching  power 
2 

•85in  printed  circuit  cards 
•lOin^/IC  for  total  cabinetry* 


This  figure  includes  power  supplies  and 
DCS  packaging  density. 


ffCKT 

CARDS 

POWER 
(WATTS  AC) 

VOLUME 

(FT^) 

10 

565 

5.85 

90 

5085 

52.65 

7 

275 

3.7 

97 

5360 

56.4 

packaging 


supplies 


air  ducting,  and  is  consistent  with 
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The  expected  replication  of  like  kinds  suggests  the  use  of  printed 
circuit  cards  for  all  but  one-of-a-kind  control  cards,  which  should  be  wire- 
wrapped.  In  anticipation  of  a high  number  of  address/control  lines  throughout 
the  system,  all  backplanes  are  likely  to  be  of  the  multilayer  printed  circuit 
type. 


The  Input  Buffer  (IDJ  module  designs  described  in  this  report  pro- 
vide the  first  level  buffering  for  the  Signal  Processor  discussed  in  Section  I 
A basic  buffer  module  is  defined  which  has  a selectable  number  of  input 
channels  and  data  storage  formats.  A module  delivers  four  parallel  outputs  t ■ 
the  DCS. 


Using  the  basic  building  block  in  its  single  input  configuration, 
three  buffer  memory  designs  are  presented.  The  first  uses  double-buffcri  . 
modules  on  each  of  four  input  channels.  The  second  has  triple-buffered 
modules  on  each  of  three  input  channels.  The  third  system  is  a vari.it  i n. 
the  second  system  which  allows  the  multiple,  overlapped  data  collecti-m  >. 
in  all  range  processing. 

Prototype  hardware  e.xperiments  have  been  conducted  to  dctcrmin.  • 
speed  limits  within  the  module  front  end,  memory  array  and  output  d.it  t 
steering  sections.  Results  indicate  that  a front  end  based  on  the  d< 
type  10176  holding  register  can  support  input  rates  of  12.‘i  Ms/s  nrr 
A second  ECL  lOK  design  using  the  device  type  10000  holding  rc.-!  t 
integration  and  hence  a larger  systcml  should  operate  at  141  'I  ' 

front  end  using  device  type  100141  should  permit  input  s]iecd  - in  ■ 

250  Ms/s.  However,  e.xperimental  results  show  tliat  the  fCI  1“^  , , 

given  that  the  front  end  provides  8:1  data  rate  reduction  i .s  uf  i*'  . .V^is  bit 
are  written  simultaneously),  limits  at  100  Ms/s.  i ijeref  ■ 1 1- , the  lu  ory  .irr,i>- 
constrains  input  speed  at  100  M-  s when  .8:1  deiv.i  1 1 i p 1 ■ v i is  emploved. 


Pipelined  addressing  on  4-RAM  groups  within  the  memory  array  permits 
buffer  output  rates  greater  than  those  using  conventional  addressing.  Experi- 
mental results  indicate  typical  output  rates  of  45  Ms/s  per  DCS  rail. 

Lastly,  it  must  be  emphasized  that  the  designs  presented  here  include 
many  options  and  many  alternative  configurations.  Any  actual  IB  would  select 
only  a subset  of  the  design  alternatives  discussed  in  this  report. 
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